Dataset pipelines by ankrgyl · Pull Request #163 · braintrustdata/bt

Ankur Goyal (ankrgyl) · 2026-04-30T21:02:23Z

No description provided.

github-actions · 2026-04-30T21:14:39Z

Latest downloadable build artifacts for this PR commit d232b9e8dcd4:

Workflow run: https://github.com/braintrustdata/bt/actions/runs/26315354441
Download all artifacts (GitHub CLI): gh run download 26315354441 --repo braintrustdata/bt
Installers are published from main automatically. To publish one for a PR branch, run release-canary manually via workflow_dispatch.

Available artifact names

- `artifacts-build-global`
- `artifacts-build-local-x86_64-pc-windows-msvc`
- `artifacts-build-local-x86_64-apple-darwin`
- `artifacts-build-local-aarch64-pc-windows-msvc`
- `artifacts-build-local-x86_64-unknown-linux-musl`
- `artifacts-build-local-x86_64-unknown-linux-gnu`
- `artifacts-build-local-aarch64-apple-darwin`
- `artifacts-build-local-aarch64-unknown-linux-gnu`
- `artifacts-plan-dist-manifest`
- `cargo-dist-cache`

Abhijeet Prasad (AbhiPrasad) · 2026-05-22T19:21:24Z

+bt datasets pipeline run ./pipeline.ts --limit 100
+
+# Staged execution for inspection or agent editing.
+bt datasets pipeline fetch ./pipeline.ts --limit 500


Why is this called fetch here when the enum is called pull? Can we make the naming more consistent?

Good catch. At some point I renamed fetch to pull, and a few references lingered.

Abhijeet Prasad (AbhiPrasad) · 2026-05-22T19:36:17Z

+    Ok((ctx, client, project))
+}
+
+fn discovery_filter(


Should we add a timestamp filter of some kind, just to constrain the queries here?

Added a --window argument, which defaults to 1d and is always AND'd into the filter.
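
To make the reply above concrete, here is a minimal, stdlib-only sketch of how a window such as 1d could be parsed and always AND'd into a discovery filter. It is an illustration under stated assumptions, not the code in this PR: the helper names `parse_window` and `with_window`, the accepted units, the `created` field, and the filter expression syntax are all hypothetical.

```rust
use std::time::Duration;

/// Parse a window such as "1d", "12h", "30m", or "45s" into a Duration.
/// Hypothetical helper: the PR's real parser and accepted units are not shown.
fn parse_window(s: &str) -> Result<Duration, String> {
    let s = s.trim();
    // The last character is the unit; everything before it is the amount.
    let unit = s.chars().last().ok_or_else(|| "empty window".to_string())?;
    let digits = &s[..s.len() - unit.len_utf8()];
    let n: u64 = digits
        .parse()
        .map_err(|_| format!("invalid window amount: {digits:?}"))?;
    let secs_per_unit: u64 = match unit {
        's' => 1,
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        other => return Err(format!("unknown window unit: {other:?}")),
    };
    n.checked_mul(secs_per_unit)
        .map(Duration::from_secs)
        .ok_or_else(|| "window too large".to_string())
}

/// AND the time window into an optional user filter so the window always applies.
/// The `created` field and the expression syntax are assumptions for illustration.
fn with_window(user_filter: Option<&str>, window: Duration) -> String {
    let window_clause = format!(
        "created >= NOW() - INTERVAL {} SECOND",
        window.as_secs()
    );
    match user_filter {
        // Parenthesize the user's filter so operator precedence cannot widen the query.
        Some(f) if !f.trim().is_empty() => format!("({}) AND {}", f.trim(), window_clause),
        _ => window_clause,
    }
}
```

Parenthesizing the user-supplied filter before appending the window clause keeps an `OR` inside that filter from escaping the time constraint, which is the point of always AND-ing the window in.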

Abhijeet Prasad (AbhiPrasad) · 2026-05-22T19:48:01Z

+    /// Maximum number of source refs to discover
+    #[arg(
+        long,
+        alias = "target",


I would prefer that we not have this alias. limit aligns better with the rest of the CLI options, and target could be confusing because here it refers to source refs, not the final row count (whereas options like --target-dataset refer to the output).

Abhijeet Prasad (AbhiPrasad) · 2026-05-22T19:50:51Z

+        );
+    }
+
+    datasets_api::create_dataset_with_metadata(


Auto-creating datasets like this might be surprising behaviour. It means folks can run into issues if they make a spelling mistake or something similar. I don't feel strongly about this, but being explicit about it might be better for the agents.

I generally agree, but our SDKs auto-create projects and datasets, so I think this is more consistent with our current semantics as-is.

clean up code

1ecc608

Ankur Goyal (ankrgyl) added 7 commits April 30, 2026 19:38

Merge branch 'main' into dataset-pipelines

7412e60

consolidate

52ebd4b

rm

5c39933

consolidate some code

9463d03

more progress

3cbce32

print destination URL

bbb3689

a few more fixes

ff6d0aa

Ankur Goyal (ankrgyl) changed the title from ~~wip: dataset pipelines~~ to Dataset pipelines on May 3, 2026

Ankur Goyal (ankrgyl) added 7 commits May 3, 2026 16:48

fix non unix

15db7da

Merge branch 'main' into dataset-pipelines

e76e807

fix build

ae5d99a

stubbed json attachmet

b6ebc93

shim typescript too

55b5225

Merge branch 'main' into dataset-pipelines

e34a717

propagate origin field properly

b5428f7

Ankur Goyal (ankrgyl) requested a review from Abhijeet Prasad (AbhiPrasad) May 21, 2026 21:56

Merge branch 'main' into dataset-pipelines

c5de9aa

Abhijeet Prasad (AbhiPrasad) reviewed May 22, 2026


comments

d232b9e

Abhijeet Prasad (AbhiPrasad) approved these changes May 22, 2026

Ankur Goyal (ankrgyl) wants to merge 17 commits into main from dataset-pipelines
